Techniques for logical design and efficient querying of data warehouses
Abstract
Logical design of data warehouses (DW) encompasses the sequence of steps that, given a core workload, determine the logical schema for the DW. A key step in logical design is view materialization. In this paper we propose an original approach to materialization in which the workload is characterized by the presence of complex queries represented by Nested Generalized Projection/Selection/Join expressions, in which sequences of aggregate operators may be applied to measures and selection predicates may be defined, at different granularities, on both dimensions and measures. We then propose a novel approach to estimating the cardinality of views based on a priori information derived from the application domain. We approach the problem by first computing satisfactory bounds for the cardinality, then determining a good probabilistic estimate for it. The results presented here concern the computation of upper bounds for the cardinality of a view, given a set of cardinality constraints expressed on some other views. Finally, we deal with the problem of populating and refreshing the data warehouse, which typically involves queries spanning several tables over the reconciled schema. We present a structural method, based on the notion of hypertree decomposition, for solving these queries efficiently. We then extend this method to also take into account quantitative information on the data values.
Similar resources
Querying Cardinal Directions between Complex Objects in Data Warehouses
Data warehouses help to store and analyze large multidimensional datasets and provide enterprise decision support. With an increased availability of spatial data in recent years, several new strategies have been proposed to enable their integration into data warehouses and perform complex OLAP analysis. Cardinal directions have turned out to be very important qualitative spatial relations d...
Developing a BIM-based Spatial Ontology for Semantic Querying of 3D Property Information
With the growing dominance of complex and multi-level urban structures, current cadastral systems, which are often developed based on 2D representations, are not capable of providing unambiguous spatial information about urban properties. Therefore, the concept of 3D cadastre is proposed to support 3D digital representation of land and properties and facilitate the communication of legal owners...
Analysis and design of approximate queries over XML documents using statistical techniques
Over the last few years, several repositories for storing XML documents and languages for querying XML data have been studied and implemented. All the query languages proposed so far allow users to obtain exact answers, but when applied to large XML repositories or warehouses, such precise queries may require long response times. To overcome this problem, in traditional relational warehouses fast approx...
A Proposed Data Mining Methodology and its Application to Industrial Procedures
Data mining is the process of discovering correlations, patterns, trends or relationships by searching through a large amount of data stored in repositories, corporate databases, and data warehouses. Industrial procedures with the help of engineers, managers, and other specialists, comprise a broad field and have many tools and techniques in their problem-solving arsenal. The purpose of this st...
Caching for Multi-dimensional Data Mining Queries
Multi-dimensional data analysis and online analytical processing are standard querying techniques applied on today’s data warehouses. Data mining algorithms, on the other hand, are still mostly run in stand-alone, batch mode on flat files extracted from relational databases. In this paper we propose a general querying model combining the power of relational databases, SQL, multidimensional quer...